Abstract

Establishing not only the asymptotic distribution of statistical estimators
but also the convergence of their moments has been recognized as an
important issue in advanced statistical theory. One of the main goals of
this paper is to present a method for deriving the moment convergence of
Z-estimators, as has been done for M-estimators. Another goal is to develop
a general, unified approach, based on partial estimating functions which we
call the "Z-process", to change point problems for ergodic models as well as
for some models where the Fisher information matrix is random and
inhomogeneous in time. Applications to some diffusion process models and
Cox's regression model are also discussed.



1 Introduction 

This paper is devoted to the study of two problems, both based on "Z-methods",
in other words, methods using the solutions to estimating equations. Let
us first outline the two themes, and then list some
examples to which our results can be applied.

* Corresponding author. 
1.1 Theme I: Moment convergence of Z-estimators

For illustration, let us consider the simplest case of i.i.d. data. Let $(\mathcal{X}, \mathcal{A}, \mu)$ be
a measure space, and let us be given a parametric family of probability densities
$f(\cdot;\theta)$ with respect to $\mu$, where $\theta \in \Theta \subset \mathbb{R}^d$. Let $X_1, X_2, \ldots$ be an i.i.d. sequence
of $\mathcal{X}$-valued random variables from this parametric model. There are two ways
to define the "maximum likelihood estimator (MLE)" in statistics. One way is to
define it as the maximum point of the random function

$$\theta \mapsto \mathbb{M}_n(\theta) = \frac{1}{n}\sum_{k=1}^{n} \log f(X_k;\theta),$$

while the other is to define it as the solution to the estimating equation

$$\dot{\mathbb{M}}_n(\theta) = 0,$$

where $\dot{\mathbb{M}}_n(\theta)$ is the gradient vector of $\mathbb{M}_n(\theta)$. The former is a special case of
"M-estimators", and the latter is that of "Z-estimators"; see van der Vaart and Wellner
(1996) for these terminologies. It may appear from the above introduction that
M-estimation and Z-estimation can be regarded as "almost equivalent". However,
this is not always true, as we will discuss below.
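As a small illustration of the two definitions, the following sketch computes the MLE both ways in a model chosen here purely for convenience (an assumption, not an example from this paper): the exponential density $f(x;\theta) = \theta e^{-\theta x}$. The function names, the search interval and the sample size are hypothetical. The M-estimate maximises the averaged log-likelihood by golden-section search, while the Z-estimate solves the score equation by bisection; both should agree with the closed form $n / \sum_k X_k$.

```python
import math
import random


def mean_loglik(theta, xs):
    """M_n(theta) = (1/n) sum_k log f(X_k; theta) for f(x; theta) = theta * exp(-theta * x)."""
    return math.log(theta) - theta * sum(xs) / len(xs)


def score(theta, xs):
    """Derivative of M_n(theta) with respect to theta."""
    return 1.0 / theta - sum(xs) / len(xs)


def m_estimate(xs, lo=1e-3, hi=50.0, iters=200):
    """M-estimation: maximise M_n by golden-section search (M_n is concave here)."""
    g = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    for _ in range(iters):
        c, d = b - g * (b - a), a + g * (b - a)
        if mean_loglik(c, xs) < mean_loglik(d, xs):
            a = c  # the maximiser lies in [c, b]
        else:
            b = d  # the maximiser lies in [a, d]
    return 0.5 * (a + b)


def z_estimate(xs, lo=1e-3, hi=50.0, iters=200):
    """Z-estimation: solve score(theta) = 0 by bisection (the score is decreasing)."""
    a, b = lo, hi
    for _ in range(iters):
        mid = 0.5 * (a + b)
        if score(mid, xs) > 0:
            a = mid
        else:
            b = mid
    return 0.5 * (a + b)


random.seed(0)
xs = [random.expovariate(2.0) for _ in range(1000)]
theta_m = m_estimate(xs)
theta_z = z_estimate(xs)
closed_form = len(xs) / sum(xs)
print(theta_m, theta_z, closed_form)
```

In this smooth, concave example the two estimates coincide up to numerical error; the differences discussed in the text concern the assumptions and proof techniques, not such simple cases.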

It is well known that the MLE $\hat\theta_n$ is asymptotically normal: it holds for any
bounded continuous function $\psi : \mathbb{R}^d \to \mathbb{R}$ that

$$\lim_{n\to\infty} E[\psi(\sqrt{n}(\hat\theta_n - \theta_0))] = E[\psi(I(\theta_0)^{-1/2} Z)],$$

where $I(\theta_0)$ is the Fisher information matrix and $Z$ is a standard Gaussian random
vector. Furthermore, it is important for some advanced theories in statistics,
including asymptotic expansions and model selection, to extend this kind of result
for bounded continuous functions $\psi$ to any continuous function $\psi$ with
polynomial growth, that is, any continuous function $\psi$ for which there exist some
constants $C = C_\psi > 0$ and $q = q_\psi > 0$ such that

$$|\psi(x)| \le C(1 + \|x\|)^q, \quad \forall x \in \mathbb{R}^d. \tag{1}$$

See the discussion in Yoshida (2011) for the importance of this problem.

We observe that, when we have an asymptotic distribution result for an estimator,
namely $R_n(\hat\theta_n - \theta_0) \to^d L(\theta_0)$ where $R_n$ is a (possibly random) diagonal matrix
and the limit random vector $L(\theta_0)$ is not necessarily Gaussian, it is sufficient for
the generalization to the case where $\psi$ is a continuous function satisfying (1) to
check that $\|R_n(\hat\theta_n - \theta_0)\|$ is asymptotically $L_p$-bounded for some $p > q$, that is,

$$\limsup_{n\to\infty} E[\|R_n(\hat\theta_n - \theta_0)\|^p] < \infty.$$
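The asymptotic $L_p$-boundedness above can be probed numerically. The sketch below is only a Monte Carlo illustration under assumptions of our own choosing (exponential model with $\theta_0 = 2$, $R_n = \sqrt{n}$, $p = 2$, and hypothetical function names); it is not part of the theory. It estimates $E[|\sqrt{n}(\hat\theta_n - \theta_0)|^p]$ for several $n$ and compares it with the corresponding moment of the Gaussian limit, which equals $I(\theta_0)^{-1} = \theta_0^2$ when $p = 2$.

```python
import math
import random


def scaled_moment(n, p, theta0=2.0, reps=2000, seed=0):
    """Monte Carlo estimate of E|sqrt(n) (theta_hat_n - theta0)|^p in the exponential model."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(reps):
        s = sum(rng.expovariate(theta0) for _ in range(n))
        theta_hat = n / s  # MLE, i.e. the root of the score equation
        total += abs(math.sqrt(n) * (theta_hat - theta0)) ** p
    return total / reps


moments = {n: scaled_moment(n, p=2) for n in (50, 200, 800)}
gaussian_limit = 2.0 ** 2  # E|I(theta0)^{-1/2} Z|^2 with I(theta0) = 1 / theta0^2
for n, value in moments.items():
    print(n, round(value, 3), gaussian_limit)
```

The estimated moments should stay bounded in $n$ and approach the Gaussian value, which is what moment convergence asserts for the function $\psi(x) = |x|^2$.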

The study of methods to obtain moment convergence of
polynomial order goes back to Ibragimov and Has'minskii (1981), who considered
the MLEs and the Bayes estimators (as some special cases of M-estimators) in the
general framework of locally asymptotically normal models. It should be
emphasized that one of the important merits of Ibragimov and Has'minskii's program
is that the theory, based on the likelihood, automatically yields also the asymptotic
efficiency. In their main theorems, it was assumed that an exponential type large
deviation inequality holds for the rescaled log-likelihood ratio random field.
Kutoyants (1984, 1994, 1998, 2004) successfully applied this theory to different stochastic
process models using some characteristics of the models under consideration, such
as small diffusion models, ergodic diffusion models and Poisson process models.
However, developing a general theory to establish the large deviation inequality
had been an open problem for many years. This problem was solved, and the
results have been published in Yoshida (2011). That paper starts by pointing out
that a polynomial type large deviation inequality is sufficient for the core part of
Ibragimov and Has'minskii's (1981) program, and then the (polynomial type) large
deviation inequality is proved in good generality. Uchida and Yoshida
(2012) applied Yoshida's (2011) theory to establish the moment convergence of
three kinds of adaptive M-estimators in ergodic diffusion process models, including
the one introduced by Kessler (1995). We mention that Nishiyama (2010) pointed
out that the moment convergence problem for M-estimators can also be solved by
using a maximal inequality instead of the large deviation inequalities, and that
Kato (2011) took this type of approach to deal with some bootstrap M-estimators.

In this paper, we consider the problem of the moment convergence of
Z-estimators. Since we have to assume that the random field (something like the
log-likelihood) is differentiable, the framework of Z-estimation is more restrictive
than that of M-estimation. On the other hand, our proof is a combination of
arguments based only on the usual Hölder and Minkowski inequalities, and no large
deviation type inequality appears in our treatment of Z-estimators.

Another difference between M- and Z-estimation is that in the latter theory
the case where the rates of convergence differ over the components of $\theta$ can
be treated easily. This is due to the fact that in the theory of Z-estimation the
gradient vector $\dot{\mathbb{L}}_n(\theta)$ of a contrast function $\mathbb{L}_n(\theta)$, where $\mathbb{L}_n(\theta)$ is typically the
log-likelihood function, can be pre-multiplied by a matrix $R_n^{-2}$ to get a kind of law
of large numbers, namely,

$$\dot{\mathbb{M}}_n(\theta) = R_n^{-2}\dot{\mathbb{L}}_n(\theta).$$

Typically, $R_n = \sqrt{n} I_d$ where $I_d$ is the identity matrix, although in the approach
to Z-estimation presented here the diagonal components of $R_n$ may be different.
Compare this with the framework of M-estimation where the (scalar valued)
contrast function $\mathbb{L}_n(\theta)$ with no assumption of differentiability has to be multiplied
by a scalar. Yoshida (2011) dealt with this point by developing an iterative method.

Thinking of these differences, we may conclude that M-estimation and Z- 
estimation are not "almost equivalent" at least for the moment convergence prob- 
lem. 
1.2 Theme II: Z-process method for change point problems

Let us give an illustration by the example of independent data again. We introduce
the partial sum process

$$\mathbb{M}_n(u,\theta) = \frac{1}{n}\sum_{k=1}^{[un]} \log f(X_k;\theta), \quad \forall u \in [0,1],$$

and consider the gradient vectors $\dot{\mathbb{M}}_n(u,\theta)$ of $\mathbb{M}_n(u,\theta)$ with respect to $\theta$. Let $\hat\theta_n$
be the MLE for the full data $X_1, \ldots, X_n$ as a special case of Z-estimators, that is,
$\hat\theta_n$ is the solution to the estimating equation

$$\dot{\mathbb{M}}_n(1,\theta) = 0.$$

The fact that the random process

$$u \rightsquigarrow \sqrt{n}\,\dot{\mathbb{M}}_n(u,\theta_0) \ \text{ converges weakly to } \ u \rightsquigarrow I(\theta_0)^{1/2}B(u)$$

in the Skorohod space $D[0,1]$, where $u \rightsquigarrow B(u)$ is a vector of independent standard
Brownian motions, is immediate from Donsker's theorem. However, it does not
seem so well known that the random process

$$u \rightsquigarrow \sqrt{n}\,\dot{\mathbb{M}}_n(u,\hat\theta_n) \ \text{ converges weakly to } \ u \rightsquigarrow I(\theta_0)^{1/2}B^\circ(u) \tag{2}$$

in $D[0,1]$, where $u \rightsquigarrow B^\circ(u)$ is a vector of independent standard Brownian bridges.
Horváth and Parzen (1994) is apparently the first to introduce the statistic

$$T_n = n \sup_{u\in[0,1]} \dot{\mathbb{M}}_n(u,\hat\theta_n)^\top \hat{I}_n^{-1} \dot{\mathbb{M}}_n(u,\hat\theta_n)$$

for change point problems, where $\hat{I}_n$ is a consistent estimator for the Fisher
information matrix $I(\theta_0)$. It is immediate from (2) and the continuous mapping
theorem that

$$T_n \to^d \sup_{u\in[0,1]} \|B^\circ(u)\|^2.$$

Let us call this approach pioneered by Horváth and Parzen (1994) the "Z-process
method".
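To make the statistic concrete, here is a minimal sketch for one specific model picked for illustration only (an assumption, not taken from the paper): the Gaussian location model $N(\theta, 1)$, where the score of one observation is $x - \theta$, the MLE is the sample mean and the Fisher information equals 1. In that case $T_n$ reduces to $\max_k (\sum_{j \le k}(X_j - \bar X_n))^2 / n$. The function name, sample sizes and the size of the shift are hypothetical.

```python
import random


def z_process_statistic(xs):
    """T_n = n * sup_u (score process)^2 / I for the N(theta, 1) location model, where I = 1."""
    n = len(xs)
    theta_hat = sum(xs) / n  # solves the full-sample estimating equation
    partial, sup_sq = 0.0, 0.0
    for x in xs:
        partial += x - theta_hat  # equals n * (score process at u = k/n, theta_hat)
        sup_sq = max(sup_sq, partial * partial)
    return sup_sq / n


random.seed(2)
n = 1000
no_change = [random.gauss(0.0, 1.0) for _ in range(n)]
with_change = ([random.gauss(0.0, 1.0) for _ in range(n // 2)]
               + [random.gauss(1.0, 1.0) for _ in range(n // 2)])

t_null = z_process_statistic(no_change)
t_alt = z_process_statistic(with_change)
print(t_null, t_alt)
```

Without a change the statistic behaves like the supremum of a squared Brownian bridge, whereas a mean shift in the middle of the sample makes it grow linearly in $n$.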

Although Horváth and Parzen (1994) did not discuss the asymptotic behavior of
the test under the alternative, Negri and Nishiyama (2011), who took the Z-process
method for an ergodic diffusion process model based on continuous observation,
also proved the consistency of the test under an alternative which has sufficient
generality. Negri and Nishiyama's (2011) argument for alternatives can be applied
also to the case of independent data. In Section 3 of this paper, we will present a
generalized version of the Z-process method which works also for some cases where the
limit of the test statistic under the null hypothesis is a functional of a "mixture"
of standard Brownian motions, where the mixture process, which is something like
a "partial process of Fisher information", is random and time dependent. We will
also develop the argument of Negri and Nishiyama (2011) for the consistency under
alternatives in a more general way. Some new examples will be given.
1.3 Notations and examples

In the rest of this section, we list some examples to which our results can be
applied. In what follows, the parameter space $\Theta$ is a bounded, open, convex subset
of $\mathbb{R}^d$, where $d$ is a fixed, positive integer. The word "vector" always means
"$d$-dimensional real column vector", and the word "matrix" means "$d \times d$ real matrix".

The Euclidean norm is denoted by $\|v\| := \sqrt{\sum_{i=1}^{d} |v^{(i)}|^2}$ for a vector $v$, where
$v^{(i)}$ denotes the $i$-th component of $v$, and by $\|A\| := \sqrt{\sum_{i,j=1}^{d} |A^{(i,j)}|^2}$ for a matrix $A$,
where $A^{(i,j)}$ denotes the $(i,j)$-component of $A$. Note that $\|Av\| \le \|A\| \cdot \|v\|$ and
$\|AB\| \le \|A\| \cdot \|B\|$ for a vector $v$ and matrices $A$, $B$. The notations $v^\top$ and $A^\top$ denote
the transpose. We use also the notation $A \circ B$ defined by $(A \circ B)^{(i,j)} := A^{(i,j)}B^{(i,j)}$
for two matrices $A$, $B$ (the Hadamard product). We denote by $I_d$ the identity
matrix. The notations $\to^p$ and $\to^d$ mean the convergence in probability and the
convergence in distribution, as $n \to \infty$, respectively.

Example A: Ergodic diffusion process. Let $I = (l, r)$, where $-\infty \le l < r \le \infty$,
be given. Let us consider an $I$-valued diffusion process $t \rightsquigarrow X_t$ which is the unique
strong solution to the stochastic differential equation (SDE)

$$X_t = X_0 + \int_0^t S(X_s;\alpha)\,ds + \int_0^t \sigma(X_s;\beta)\,dW_s,$$

where $s \rightsquigarrow W_s$ is a standard Wiener process. The parameters come from $\alpha \in \Theta_A \subset \mathbb{R}^{d_A}$
and $\beta \in \Theta_B \subset \mathbb{R}^{d_B}$, and we denote $\theta = (\alpha^\top, \beta^\top)^\top$. We are supposed to be
able to observe the process $X$ at discrete time grids $0 = t_0^n < t_1^n < \cdots < t_n^n$, and
we shall consider the asymptotic scheme $n\Delta_n^2 \to 0$ and $t_n^n \to \infty$ as $n \to \infty$, where

$$\Delta_n = \max_{1 \le k \le n} |t_k^n - t_{k-1}^n| \quad \text{and} \quad [\ldots] \quad \text{as } n \to \infty. \tag{3}$$

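For Themes I and II, we introduce

$$\dot{\mathbb{M}}_n(\theta) = R_n^{-2}\dot{\mathbb{L}}_n(1,\theta) \quad \text{and} \quad \dot{\mathbb{M}}_n(u,\theta) = R_n^{-2}\dot{\mathbb{L}}_n(u,\theta),$$

respectively, where

$$\mathbb{L}_n(u,\theta) = -\sum_{k:\, t_{k-1}^n \le u t_n^n} \left\{ \log \sigma(X_{t_{k-1}^n};\beta) + \frac{|X_{t_k^n} - X_{t_{k-1}^n} - S(X_{t_{k-1}^n};\alpha)|t_k^n - t_{k-1}^n||^2}{2\sigma(X_{t_{k-1}^n};\beta)^2 |t_k^n - t_{k-1}^n|} \right\}$$

and $R_n$ is the diagonal matrix such that $R_n^{(i,i)}$ is $\sqrt{t_n^n}$ for $i = 1, \ldots, d_A$ and $\sqrt{n}$ for
$i = d_A + 1, \ldots, d$ with $d = d_A + d_B$.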
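The following sketch shows how a contrast function of this Gaussian quasi-likelihood type can be evaluated on discretely observed data. Everything specific in it is an assumption made for illustration: the Ornstein-Uhlenbeck model $S(x;\alpha) = -\alpha x$, $\sigma(x;\beta) = \beta$, the equidistant grid, the parameter values and the function names. The contrast implemented is the local Gaussian one, $-\sum_k \{\log\sigma + (\text{increment} - S\,\Delta)^2 / (2\sigma^2\Delta)\}$, summed over $k$ with $t_{k-1}^n \le u\, t_n^n$.

```python
import math
import random


def simulate_ou(alpha, beta, n, delta, rng):
    """Exact simulation of dX = -alpha * X dt + beta dW on an equidistant grid, X_0 = 0."""
    decay = math.exp(-alpha * delta)
    sd = beta * math.sqrt((1.0 - decay * decay) / (2.0 * alpha))
    xs = [0.0]
    for _ in range(n):
        xs.append(xs[-1] * decay + sd * rng.gauss(0.0, 1.0))
    return xs


def contrast(xs, delta, alpha, beta, u=1.0):
    """Local Gaussian contrast L_n(u, theta) for S(x; alpha) = -alpha * x, sigma(x; beta) = beta."""
    n = len(xs) - 1
    total = 0.0
    for k in range(1, n + 1):
        if k - 1 > u * n:  # keep only the terms with t_{k-1} <= u * t_n
            break
        resid = xs[k] - xs[k - 1] + alpha * xs[k - 1] * delta
        total -= math.log(beta) + resid * resid / (2.0 * beta * beta * delta)
    return total


rng = random.Random(3)
n, delta = 5000, 0.01
path = simulate_ou(alpha=1.0, beta=0.5, n=n, delta=delta, rng=rng)

at_truth = contrast(path, delta, alpha=1.0, beta=0.5)
wrong_drift = contrast(path, delta, alpha=4.0, beta=0.5)
wrong_diffusion = contrast(path, delta, alpha=1.0, beta=1.0)
print(at_truth, wrong_drift, wrong_diffusion)
```

The contrast is noticeably more sensitive to the diffusion parameter than to the drift parameter, which reflects the two different rates $\sqrt{n}$ and $\sqrt{t_n^n}$ collected in the matrix $R_n$.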
The problem of establishing the moment convergence of M-estimators in this
model, where $X$ is a multi-dimensional diffusion process, was considered by Yoshida
(2011), and Uchida and Yoshida (2012) relaxed the assumption $n\Delta_n^2 \to 0$ up
to $n\Delta_n^a \to 0$, where $a > 2$ is a constant depending on the smoothness of the
model, as was done by Kessler (1995, 1997). To treat the parameters with different
rates of convergence, in both papers an iterative method is used, and the method
leads to some "adaptive estimators" that have advantages in applications, as their
simulation results show. In order to explain our core idea clearly, we only consider
the one-dimensional diffusion process $X$ under the sampling scheme $n\Delta_n^2 \to 0$.
Some extension with no iterative argument to the case considered in Uchida and
Yoshida (2012) could be possible.

Regarding Theme II, Song and Lee (2009) proposed a statistic for testing the
existence of a change point of the parameter $\beta$, but the problem of testing it for both
parameters was left as an open problem in their paper (see their Section 5). We
will give an answer to this problem in Section 4.2.

Example B: Volatility of diffusion process. Let $I = (l, r)$, where $-\infty \le l < r \le \infty$,
be given. Let us consider an $I$-valued diffusion process $t \rightsquigarrow X_t$ which is the
unique strong solution to the SDE

$$X_t = X_0 + \int_0^t S(X_s)\,ds + \int_0^t \sigma(X_s;\theta)\,dW_s,$$

where $s \rightsquigarrow W_s$ is a standard Wiener process. Here, the drift coefficient $S(\cdot)$ is
treated as an unknown nuisance function. We are supposed to be able to observe
the process $X$ at discrete time grids $0 = t_0^n < t_1^n < \cdots < t_n^n = T < \infty$, and we shall
consider the asymptotic scheme (3).
We introduce

$$\dot{\mathbb{M}}_n(\theta) = \frac{1}{n}\dot{\mathbb{L}}_n(1,\theta) \quad \text{and} \quad \dot{\mathbb{M}}_n(u,\theta) = \frac{1}{n}\dot{\mathbb{L}}_n(u,\theta)$$

for Themes I and II, respectively, where

$$\mathbb{L}_n(u,\theta) = -\sum_{k:\, t_{k-1}^n \le u t_n^n} \left\{ \log \sigma(X_{t_{k-1}^n};\theta) + \frac{|X_{t_k^n} - X_{t_{k-1}^n}|^2}{2\sigma(X_{t_{k-1}^n};\theta)^2 |t_k^n - t_{k-1}^n|} \right\}.$$

The rate matrix is given by $R_n = \sqrt{n} I_d$.

Iacus and Yoshida (2012) proposed an estimator for the change point in a
similar model. Our result for Theme II here, dealing with testing the existence
of a change point, can be applied before statisticians proceed to their theory of
estimation.

As we already mentioned, Song and Lee (2009) proposed a statistic for testing
the existence of a change point in the volatility of an ergodic diffusion process
under the asymptotic scheme $n\Delta_n^a \to 0$ and $n\Delta_n^b \to \infty$ for some $a > b > 4$, by an
approach which is different from ours.
Example C: Cox's regression model. Let a sequence of counting processes $t \rightsquigarrow N_t^k$,
$k = 1, 2, \ldots$, which do not have simultaneous jumps, be observed during the
time interval $[0, T]$. Suppose that $t \rightsquigarrow N_t^k$ has the intensity

$$\lambda_t^k(\theta) = \alpha(t)\, e^{\theta^\top Z_t^k}\, Y_t^k,$$

where the baseline hazard function $\alpha$, which is common for all $k$'s, is non-negative
and satisfies $\int_0^T \alpha(t)\,dt < \infty$, the random process $t \rightsquigarrow Z_t^k$ is a vector valued covariate
for the individual $k$, and the random process $t \rightsquigarrow Y_t^k$ is given by

$$Y_t^k = \begin{cases} 1, & \text{if the individual } k \text{ is observed at time } t, \\ 0, & \text{otherwise.} \end{cases}$$

This model was introduced by Cox (1972), and its asymptotic theory was developed
by Andersen and Gill (1982).

For Themes I and II, we introduce

$$\dot{\mathbb{M}}_n(\theta) = \frac{1}{n}\dot{\mathbb{L}}_n(1,\theta) \quad \text{and} \quad \dot{\mathbb{M}}_n(u,\theta) = \frac{1}{n}\dot{\mathbb{L}}_n(u,\theta),$$

respectively, where

$$\mathbb{L}_n(u,\theta) = \sum_{k=1}^{n} \int_0^{uT} \left(\theta^\top Z_t^k - \log S_t^{n,0}(\theta)\right) dN_t^k$$

with

$$S_t^{n,0}(\theta) = \sum_{k=1}^{n} e^{\theta^\top Z_t^k}\, Y_t^k.$$

The rate matrix is $R_n = \sqrt{n} I_d$.



Example D: Counting process models. Let $t \rightsquigarrow N_t$ be a counting process with
the intensity $t \rightsquigarrow \lambda_t(\theta)$. We suppose that we can observe the process on the
compact time interval $[0, T_n]$, and consider the asymptotic scheme $T_n \to \infty$. Our
results may be typically applied to

$$\dot{\mathbb{M}}_n(\theta) = \frac{1}{T_n}\dot{\mathbb{L}}_n(1,\theta) \quad \text{and} \quad \dot{\mathbb{M}}_n(u,\theta) = \frac{1}{T_n}\dot{\mathbb{L}}_n(u,\theta),$$

where

$$\mathbb{L}_n(u,\theta) = \int_0^{uT_n} \log \lambda_t(\theta)\,dN_t - \int_0^{uT_n} \lambda_t(\theta)\,dt.$$

The rate matrix is typically $R_n = \sqrt{T_n} I_d$. An example of this model is the stress
release process introduced by Isham and Westcott (1979):

$$\lambda_t(\alpha,\beta,\theta) = \phi(\alpha t - \beta N_{t-};\theta).$$

It is known that the process $t \rightsquigarrow X_t := \alpha t - \beta N_{t-}$ is ergodic under some mild
conditions. A test statistic for the change point problem in this model, which is
different from ours, has been proposed by Fujii and Nishiyama (2011).






Example E: Non-linear time series models. Let us consider the time series
models of the form

$$X_k = a(X_{k-1}, X_{k-2}, \ldots;\theta) + b(X_{k-1}, X_{k-2}, \ldots;\theta)\,\epsilon_k, \quad k = 1, 2, \ldots.$$

Here, $\{\epsilon_k\}$ is an i.i.d. sequence with $E[\epsilon_1] = 0$ or, more generally, a martingale
difference sequence with respect to the filtration $(\mathcal{F}_k)_{k \ge 0}$ where $\mathcal{F}_k = \sigma(X_k, X_{k-1}, \ldots)$.
A possible way to define the estimating functions is

$$\dot{\mathbb{M}}_n(\theta) = \frac{1}{n}\dot{\mathbb{L}}_n(1,\theta) \quad \text{and} \quad \dot{\mathbb{M}}_n(u,\theta) = \frac{1}{n}\dot{\mathbb{L}}_n(u,\theta),$$

where

$$[\ldots]$$

The rate matrix is typically given by $R_n = \sqrt{n} I_d$.

Some detailed discussions for Examples A, B, and C will be given in Sections
4, 5 and 6, respectively, while Examples D and E are not discussed in detail here.

2 Moment convergence of Z-estimators 

Let $\Theta$ be a bounded, open, convex subset of $\mathbb{R}^d$. Let us be given a real valued
random function $\mathbb{L}_n(\theta)$ of $\theta \in \Theta$ which is twice continuously differentiable with the
gradient vector $\dot{\mathbb{L}}_n(\theta)$ and the Hessian matrix $\ddot{\mathbb{L}}_n(\theta)$, defined on a probability space
$(\Omega, \mathcal{F}, P)$ that is common for all $n \in \mathbb{N}$. (However, it will be clear from our proofs
that if the limit matrices $V(\theta_0)$ and $\ddot{\mathbb{M}}(\theta)$ appearing below are non-random then
the underlying probability spaces need not be common for all $n \in \mathbb{N}$.) Let $R_n$
be a (possibly random) diagonal matrix whose diagonal components are positive,
and define $Q_n$ by $Q_n^{(i,j)} = (R_n^{(i,i)} R_n^{(j,j)})^{-1}$. Using these matrices, put

$$\dot{\mathbb{M}}_n(\theta) := R_n^{-2}\dot{\mathbb{L}}_n(\theta) \quad \text{and} \quad \ddot{\mathbb{M}}_n(\theta) := Q_n \circ \ddot{\mathbb{L}}_n(\theta). \tag{4}$$

(In the typical cases, $R_n = \sqrt{n} I_d$ and $Q_n = n^{-1}\mathbf{1}$, where $\mathbf{1}$ denotes the matrix
whose components are all 1.)
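A short numerical sketch may help to see what the Hadamard rescaling by $Q_n$ does when the rates differ across components. Since $R_n$ is diagonal, $Q_n \circ A$ coincides entrywise with $R_n^{-1} A R_n^{-1}$; the code below checks this identity on a made-up $2 \times 2$ example. The helper names, the rates and the "Hessian" values are hypothetical and chosen only for illustration.

```python
def q_matrix(r_diag):
    """Q_n with entries Q^(i,j) = 1 / (R^(i,i) * R^(j,j)) for a diagonal rate matrix R_n."""
    return [[1.0 / (ri * rj) for rj in r_diag] for ri in r_diag]


def hadamard(a, b):
    """Entrywise (Hadamard) product of two matrices given as nested lists."""
    return [[x * y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


def two_sided_rescale(r_diag, a):
    """R^{-1} A R^{-1} for diagonal R, computed entry by entry."""
    d = len(r_diag)
    return [[a[i][j] / (r_diag[i] * r_diag[j]) for j in range(d)] for i in range(d)]


n, t_n = 10_000, 100.0
r_diag = [t_n ** 0.5, n ** 0.5]                 # two different rates, as in Example A
hessian = [[-250.0, 30.0], [30.0, -20_000.0]]   # made-up values standing in for the Hessian of L_n

m_ddot = hadamard(q_matrix(r_diag), hessian)
print(m_ddot)
```

Each entry of the Hessian is thus normalised by the product of the rates of the two parameters involved, so that the diagonal blocks stabilise while the cross terms are damped.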

Theorem 2.1 Consider the above setting. Suppose that there exists a sequence
of matrices $V_n(\theta_0)$ which are regular almost surely such that for any sequence of
$\Theta$-valued random vectors $\tilde\theta_n$ converging in probability to $\theta_0$,

$$\ddot{\mathbb{M}}_n(\tilde\theta_n) + V_n(\theta_0) \to^p 0.$$

Suppose also that

$$(R_n\dot{\mathbb{M}}_n(\theta_0), V_n(\theta_0)) \to^d (L(\theta_0), V(\theta_0)),$$

where $L(\theta_0)$ is a random vector and $V(\theta_0)$ is a random matrix which is regular
almost surely (we do not assume that $V(\theta_0)$ and $L(\theta_0)$ are independent).

Then, for any sequence of $\Theta$-valued random vectors $\hat\theta_n$ which converges in
probability to $\theta_0$ and satisfies $\|R_n\dot{\mathbb{M}}_n(\hat\theta_n)\| = o_P(1)$, it holds that

$$R_n(\hat\theta_n - \theta_0) = V_n(\theta_0)^{-1} R_n\dot{\mathbb{M}}_n(\theta_0) + o_P(1) \to^d V(\theta_0)^{-1} L(\theta_0).$$

Remark. In this theorem, the consistency of the sequence of Z-estimators $\hat\theta_n$
has been assumed. A method to show this property will be given in Lemma 3.1 (i)
below.
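The asymptotic representation in Theorem 2.1 can be checked numerically in a toy case. The sketch assumes, for illustration only, the exponential model $f(x;\theta) = \theta e^{-\theta x}$ with $R_n = \sqrt{n}$, for which the score is $1/\theta - \bar X_n$, the second derivative is the non-random quantity $-1/\theta^2$, and hence $V_n(\theta_0) = V(\theta_0) = 1/\theta_0^2$. The variable names and the sample size are hypothetical.

```python
import math
import random

random.seed(1)
theta0, n = 2.0, 10_000
xs = [random.expovariate(theta0) for _ in range(n)]
xbar = sum(xs) / n

theta_hat = 1.0 / xbar                 # root of the estimating equation 1/theta - xbar = 0
r_n = math.sqrt(n)                     # rate matrix (a scalar here)
score_at_truth = 1.0 / theta0 - xbar   # score at theta0
v = 1.0 / theta0 ** 2                  # V(theta0): minus the derivative of the score

lhs = r_n * (theta_hat - theta0)       # rescaled estimation error
rhs = r_n * score_at_truth / v         # leading term V^{-1} R_n * score
print(lhs, rhs, lhs - rhs)
```

In this model the remainder can be written explicitly as $\sqrt{n}\,(1 - \theta_0 \bar X_n)^2 / \bar X_n$, which is of order $n^{-1/2}$ in probability, in line with the $o_P(1)$ term of the theorem.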

Now, we give a theorem to establish the moment convergence of Z-estimators,
which is the main result of this section.

Theorem 2.2 Consider the setting described in the first paragraph of this section.
Let some constants $p > 1$ and $a, b > 1$ such that $\frac{1}{a} + \frac{1}{b} = 1$ be given; see a remark
at the end of the theorem for the case where we may set $a = 1$.
Suppose that

$$\|R_n\dot{\mathbb{M}}_n(\theta_0)\| \ \text{is asymptotically } L_{pa}\text{-bounded}. \tag{5}$$

Suppose also that there exist a constant $\gamma \in (0, 1]$ and some random matrices $\ddot{\mathbb{M}}(\theta)$
indexed by $\theta \in \Theta$ such that

$$\lim_{n\to\infty} E\left[ \sup_{\theta\in\Theta} \|R_n^{\gamma}(\ddot{\mathbb{M}}_n(\theta) - \ddot{\mathbb{M}}(\theta))\|^{pa/\gamma} \right] = 0. \tag{6}$$

Suppose further that either of the following [M1] or [M2] is satisfied:

[M1] There exists a random matrix $J(\theta_0)$ which is positive definite almost surely
such that $\ddot{\mathbb{M}}(\theta) \le -J(\theta_0)$ for all $\theta \in \Theta$, almost surely, and that $E[\|J(\theta_0)^{-1}\|^{pb/\gamma}] < \infty$;

[M2] $E[\sup_{\theta\in\Theta} \|\ddot{\mathbb{M}}(\theta)^{-1}\|^{pb/\gamma}] < \infty$, where the random matrices $\ddot{\mathbb{M}}(\theta)$'s are
assumed to be regular almost surely.

Then, for any sequence of $\Theta$-valued random vectors $\hat\theta_n$ such that $\|R_n\dot{\mathbb{M}}_n(\hat\theta_n)\|$
is asymptotically $L_{pa}$-bounded, it holds that $\|R_n(\hat\theta_n - \theta_0)\|$ is asymptotically
$L_p$-bounded. Therefore, in this situation, whenever we also have that $R_n(\hat\theta_n - \theta_0) \to^d G(\theta_0)$
where $G(\theta_0)$ is a random vector, it holds for any continuous function $\psi$
satisfying (1) for $q \in (0, p)$ that

$$\lim_{n\to\infty} E[\psi(R_n(\hat\theta_n - \theta_0))] = E[\psi(G(\theta_0))],$$

where the limit is also finite.

When the last condition in [M1] is satisfied with $\|J(\theta_0)^{-1}\|$ which is bounded,
or the first condition in [M2] is satisfied with $\sup_{\theta\in\Theta} \|\ddot{\mathbb{M}}(\theta)^{-1}\|$ which is bounded,
the constant $a$ appearing in the above claim may be replaced by 1.
Remark. The condition [M1] corresponds to the case $\rho = 2$ of the conditions
[A3] and [A5] in Yoshida (2011), which are

$$\mathbb{M}(\theta) - \mathbb{M}(\theta_0) \le -\chi(\theta_0)\|\theta - \theta_0\|^{\rho}, \quad \forall\theta \in \Theta,$$

and high order moment conditions on the positive random variable $\chi(\theta_0)^{-1}$, where
"$\mathbb{M}(\theta)$" should be read as "$\mathbb{Y}(\theta)$" in Yoshida's (2011) notation.

Proof of Theorem 2.1. Recalling (4), it follows from the Taylor expansion that

$$(R_n\dot{\mathbb{M}}_n(\hat\theta_n))^{(i)} = (R_n\dot{\mathbb{M}}_n(\theta_0))^{(i)} + (\ddot{\mathbb{M}}_n(\tilde\theta_n) R_n(\hat\theta_n - \theta_0))^{(i)}, \quad i = 1, \ldots, d, \tag{7}$$

where $\tilde\theta_n$ is a random vector on the segment connecting $\theta_0$ and $\hat\theta_n$. So we have

$$R_n(\hat\theta_n - \theta_0) = A_n + B_n R_n(\hat\theta_n - \theta_0), \tag{8}$$

where

$$A_n = V_n(\theta_0)^{-1} R_n(\dot{\mathbb{M}}_n(\theta_0) - \dot{\mathbb{M}}_n(\hat\theta_n)),$$

$$B_n = V_n(\theta_0)^{-1}(\ddot{\mathbb{M}}_n(\tilde\theta_n) + V_n(\theta_0)).$$

It follows from the extended continuous mapping theorem (e.g., Theorem 1.11.1
of van der Vaart and Wellner (1996)) that $V_n(\theta_0)^{-1} \to^d V(\theta_0)^{-1}$, thus we have
$\|A_n\| = O_P(1)$ and $\|B_n\| = o_P(1)$. It therefore holds that

$$\|R_n(\hat\theta_n - \theta_0)\| \le O_P(1) + o_P(1) \cdot \|R_n(\hat\theta_n - \theta_0)\|,$$

which implies that $\|R_n(\hat\theta_n - \theta_0)\| = O_P(1)$. Hence, going back to (8) we obtain

$$R_n(\hat\theta_n - \theta_0) = A_n + o_P(1) = V_n(\theta_0)^{-1} R_n\dot{\mathbb{M}}_n(\theta_0) + o_P(1).$$

The last claim is also a consequence of the extended continuous mapping theorem.
The proof is finished. □

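Proof of Theorem 2.2. We will give a proof for the case where [M1] is assumed.
The proof for the case where [M2] is assumed is similar (and simpler), so it is
omitted.

Due to (7) again, we have

$$R_n(\hat\theta_n - \theta_0) = C_n + (D_n^{(1)} + D_n^{(2)}) R_n(\hat\theta_n - \theta_0),$$

where

$$C_n = J(\theta_0)^{-1} R_n(\dot{\mathbb{M}}_n(\theta_0) - \dot{\mathbb{M}}_n(\hat\theta_n)),$$

$$D_n^{(1)} = J(\theta_0)^{-1}(\ddot{\mathbb{M}}_n(\tilde\theta_n) - \ddot{\mathbb{M}}(\tilde\theta_n)),$$

$$D_n^{(2)} = J(\theta_0)^{-1}(\ddot{\mathbb{M}}(\tilde\theta_n) + J(\theta_0)),$$

where $\tilde\theta_n$ is a random vector on the segment connecting $\theta_0$ and $\hat\theta_n$.

From now on, we consider the case $\gamma \in (0, 1)$; the proof for the case $\gamma = 1$
is easier, and it is omitted. Since $-D_n^{(2)}$ is non-negative definite almost surely, it
follows from Minkowski's and Hölder's inequalities that

$$(E[\|R_n(\hat\theta_n - \theta_0)\|^p])^{1/p} \le (E[\|(I_d - D_n^{(2)}) R_n(\hat\theta_n - \theta_0)\|^p])^{1/p} \le O(1) + o(1) \cdot (E[\|R_n^{1-\gamma}(\hat\theta_n - \theta_0)\|^{p/(1-\gamma)}])^{(1-\gamma)/p},$$

where we have used Hölder's inequality again to get

$$E[\|C_n\|^p] \le (E[\|J(\theta_0)^{-1}\|^{pb}])^{1/b} (E[\|R_n(\dot{\mathbb{M}}_n(\theta_0) - \dot{\mathbb{M}}_n(\hat\theta_n))\|^{pa}])^{1/a}$$

and

$$E[\|R_n^{\gamma} D_n^{(1)}\|^{p/\gamma}] \le (E[\|J(\theta_0)^{-1}\|^{pb/\gamma}])^{1/b} (E[\|R_n^{\gamma}(\ddot{\mathbb{M}}_n(\tilde\theta_n) - \ddot{\mathbb{M}}(\tilde\theta_n))\|^{pa/\gamma}])^{1/a};$$

if $\|J(\theta_0)^{-1}\|$ is bounded, we can get this kind of bounds with $a = 1$.
Notice that

$$\|R_n^{1-\gamma}(\hat\theta_n - \theta_0)\|^{1/(1-\gamma)} \le \sqrt{\sum_{i=1}^{d} d^{(1/(1-\gamma))-1} |R_n^{(i,i)}|^2 |\hat\theta_n^{(i)} - \theta_0^{(i)}|^{2/(1-\gamma)}} \le \|R_n(\hat\theta_n - \theta_0)\| \cdot d^{\gamma/(2(1-\gamma))} \cdot |\mathcal{D}(\Theta)|^{\gamma/(1-\gamma)},$$

where $\mathcal{D}(\Theta)$ denotes the diameter of $\Theta$. So we obtain

$$(E[\|R_n(\hat\theta_n - \theta_0)\|^p])^{1/p} \le O(1) + o(1) \cdot (E[\|R_n(\hat\theta_n - \theta_0)\|^p])^{(1-\gamma)/p} \le O(1) + o(1) \cdot (E[\|R_n(\hat\theta_n - \theta_0)\|^p] \vee 1)^{1/p},$$

which yields that

$$E[\|R_n(\hat\theta_n - \theta_0)\|^p] \le O(1) + o(1) \cdot E[\|R_n(\hat\theta_n - \theta_0)\|^p].$$

Therefore, $\|R_n(\hat\theta_n - \theta_0)\|$ is asymptotically $L_p$-bounded. □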



3 Z-process method for change point problems 

Let $D[0,1]$ be the space of functions defined on $[0,1]$ taking values in a
finite-dimensional Euclidean space, which are right continuous and have left hand limits;
we equip this space with the Skorohod metric. Throughout this section, all random
processes, denoted as $u \rightsquigarrow X(u)$, are assumed to take values in $D[0,1]$.
Let $\Theta$ be a bounded, open, convex subset of $\mathbb{R}^d$. For every $n \in \mathbb{N}$, let
$u \rightsquigarrow \mathbb{L}_n(u,\theta)$ be a real valued random process indexed by $\theta \in \Theta$, defined on a
probability space $(\Omega, \mathcal{F}, P)$ that is common for all $n \in \mathbb{N}$. (However, it will be
clear from our proofs that the underlying probability spaces do not have to be
common for $n \in \mathbb{N}$ if the objects $\dot{\mathbb{M}}_{\theta_0}(u,\theta)$, $\dot{\mathcal{M}}(u,\theta)$ and $V(u,\theta_0)$ appearing in the
limit below are non-random.) We suppose that for every $u \in [0,1]$ the random
function $\theta \mapsto \mathbb{L}_n(u,\theta)$ is two times continuously differentiable with the gradient
vector $\dot{\mathbb{L}}_n(u,\theta)$ and the Hessian matrix $\ddot{\mathbb{L}}_n(u,\theta)$. Let a (possibly random) diagonal
matrix $R_n$ whose diagonal components are positive be given, and define $Q_n$ by
$Q_n^{(i,j)} = (R_n^{(i,i)} R_n^{(j,j)})^{-1}$. Using these matrices, put

$$\dot{\mathbb{M}}_n(u,\theta) := R_n^{-2}\dot{\mathbb{L}}_n(u,\theta) \quad \text{and} \quad \ddot{\mathbb{M}}_n(u,\theta) := Q_n \circ \ddot{\mathbb{L}}_n(u,\theta).$$

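We consider the following testing problem:

$H_0$: the true value $\theta_0 \in \Theta$ does not change during $u \in [0,1]$;
$H_1$: "not $H_0$".

The meaning of "not $H_0$" will be precisely specified in the condition [A] below.
Let us describe some properties which the "limits" $\dot{\mathbb{M}}_{\theta_0}(u,\theta)$ and $\dot{\mathcal{M}}(u,\theta)$ of the
random vectors $\dot{\mathbb{M}}_n(u,\theta)$ under $H_0$ and under $H_1$, respectively, have to satisfy.

[N] Under $H_0$, it holds that

$$\sup_{\theta\in\Theta} \|\dot{\mathbb{M}}_n(1,\theta) - \dot{\mathbb{M}}_{\theta_0}(1,\theta)\| \to^p 0, \tag{9}$$

where the limits $\dot{\mathbb{M}}_{\theta_0}(1,\theta)$'s satisfy that

$$\inf_{\theta:\, \|\theta - \theta_0\| > \varepsilon} \|\dot{\mathbb{M}}_{\theta_0}(1,\theta)\| > 0 = \|\dot{\mathbb{M}}_{\theta_0}(1,\theta_0)\|, \quad \text{almost surely}, \ \forall\varepsilon > 0. \tag{10}$$

[A] Under $H_1$, it holds that

$$\sup_{u\in[0,1]} \sup_{\theta\in\Theta} \|\dot{\mathbb{M}}_n(u,\theta) - \dot{\mathcal{M}}(u,\theta)\| \to^p 0, \tag{11}$$

where the limits $\dot{\mathcal{M}}(u,\theta)$'s satisfy that there exists a $\Theta$-valued random vector $\theta_*$
such that

$$\inf_{\theta:\, \|\theta - \theta_*\| > \varepsilon} \|\dot{\mathcal{M}}(1,\theta)\| > 0 = \|\dot{\mathcal{M}}(1,\theta_*)\|, \quad \text{almost surely}, \ \forall\varepsilon > 0, \tag{12}$$

and that

$$\sup_{u\in(0,1)} \|\dot{\mathcal{M}}(u,\theta_*)\| > 0, \quad \text{almost surely}. \tag{13}$$

Assuming the conditions (9), (10), (11) and (12) is natural; see e.g. Theorems
5.7 and 5.9 of van der Vaart (1998). Let us explain how to check (13) in the most
typical form of alternatives in change point problems:

$H_1'$: there exists a constant $u_* \in (0,1)$ such that the true value is $\theta_0 \in \Theta$ for
$u \in [0, u_*]$, and $\theta_1 \in \Theta$ for $u \in (u_*, 1]$, where $\theta_0 \ne \theta_1$.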
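In many cases of "ergodic models", under $H_1'$, the condition (11) is satisfied
with $\dot{\mathcal{M}}(u,\theta)$ such that

$$\dot{\mathcal{M}}(u_*,\theta) = u_*\dot{\mathbb{M}}_{\theta_0}(1,\theta) \quad \text{and} \quad \dot{\mathcal{M}}(1,\theta) = u_*\dot{\mathbb{M}}_{\theta_0}(1,\theta) + (1 - u_*)\dot{\mathbb{M}}_{\theta_1}(1,\theta),$$

where not only $\dot{\mathbb{M}}_{\theta_0}(1,\theta)$'s but also $\dot{\mathbb{M}}_{\theta_1}(1,\theta)$'s are assumed to satisfy (10) with
trivial change of notation. To see that the condition (13) is satisfied, notice that

$$\dot{\mathcal{M}}(u_*,\theta_*) = \dot{\mathcal{M}}(u_*,\theta_*) - u_*\dot{\mathcal{M}}(1,\theta_*) = u_*(1 - u_*)(\dot{\mathbb{M}}_{\theta_0}(1,\theta_*) - \dot{\mathbb{M}}_{\theta_1}(1,\theta_*));$$

if this were zero with positive probability, then it should follow from $\dot{\mathcal{M}}(1,\theta_*) = 0$
that $\dot{\mathbb{M}}_{\theta_0}(1,\theta_*) = \dot{\mathbb{M}}_{\theta_1}(1,\theta_*) = 0$ with positive probability, and this contradicts
(10) and the assumption that $\theta_0 \ne \theta_1$. Therefore, we have

$$\sup_{u\in(0,1)} \|\dot{\mathcal{M}}(u,\theta_*)\| \ge \|\dot{\mathcal{M}}(u_*,\theta_*)\| = u_*(1 - u_*)\|\dot{\mathbb{M}}_{\theta_0}(1,\theta_*) - \dot{\mathbb{M}}_{\theta_1}(1,\theta_*)\| > 0, \quad \text{almost surely}.$$

This positive value is closely related to the power of our test under $H_1'$.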

Now, we prepare a lemma to prove the consistency of a sequence of Z-estimators.
This lemma can be proved in exactly the same way as Theorems 5.7 and 5.9 of van
der Vaart (1998), so the proof is omitted.

Lemma 3.1 (i) Under [N], for any sequence of $\Theta$-valued random vectors $\hat\theta_n$ such
that $\|\dot{\mathbb{M}}_n(1,\hat\theta_n)\| = o_P(1)$, it holds that $\hat\theta_n \to^p \theta_0$.

(ii) Under [A], for any sequence of $\Theta$-valued random vectors $\hat\theta_n$ such that
$\|\dot{\mathbb{M}}_n(1,\hat\theta_n)\| = o_P(1)$, it holds that $\hat\theta_n \to^p \theta_*$.

We are ready to state our main result of this section. 

Theorem 3.2 Consider the above situation. Let $\hat\theta_n$ be any sequence of $\Theta$-valued
random vectors such that $\|R_n\dot{\mathbb{M}}_n(1,\hat\theta_n)\| = o_P(1)$ under $H_0$ and $H_1$. Let $u \rightsquigarrow \hat{V}_n(u)$
be any sequence of matrix valued random processes, which are regular except for
$u = 0$ almost surely, and it should be a uniformly consistent sequence of estimators
for the non-negative definite matrix valued random process $u \rightsquigarrow V(u,\theta_0)$ appearing
below under $H_0$. Introduce the test statistic

$$T_n = \sup_{u\in(0,1]} (R_n\dot{\mathbb{M}}_n(u,\hat\theta_n))^\top (u\hat{V}_n(u)^{-1}) R_n\dot{\mathbb{M}}_n(u,\hat\theta_n).$$

(i) Under [N], suppose that there exists a sequence of matrix valued random
processes $u \rightsquigarrow V_n(u,\theta_0)$ such that $V_n(1,\theta_0)$'s are regular almost surely and that
for any sequence of $\Theta$-valued random vectors $\tilde\theta_n(u)$ indexed by $u \in [0,1]$ satisfying
$\sup_{u\in[0,1]} \|\tilde\theta_n(u) - \theta_0\| \to^p 0$,

$$\sup_{u\in[0,1]} \|\ddot{\mathbb{M}}_n(u,\tilde\theta_n(u)) + V_n(u,\theta_0)\| \to^p 0. \tag{14}$$

Suppose also that

$$(R_n\dot{\mathbb{M}}_n(u,\theta_0), V_n(u,\theta_0)) \to^d ((u^{-1}V(u,\theta_0))^{1/2}B(u), V(u,\theta_0)), \quad \text{in } D[0,1], \tag{15}$$

where $u \rightsquigarrow V(u,\theta_0)$ is a non-negative definite matrix valued random process such
that $V(1,\theta_0)$ is positive definite almost surely, and $u \rightsquigarrow B(u)$ is a vector of
independent standard Brownian motions; the value of the first vector of the limit in (15)
at $u = 0$ should be read as zero. (In general we do not assume that $u \rightsquigarrow V(u,\theta_0)$
and $u \rightsquigarrow B(u)$ are independent.)

If $\sup_{u\in[0,1]} \|\hat{V}_n(u) - V(u,\theta_0)\| \to^p 0$, then it holds that

$$T_n \to^d \sup_{u\in[0,1]} \|B(u) - u^{1/2}V(u,\theta_0)^{1/2}V(1,\theta_0)^{-1/2}B(1)\|^2. \tag{16}$$

Therefore the test is asymptotically distribution free if $V(u,\theta_0) = uV(1,\theta_0)$ for
every $u \in [0,1]$, because the limit in this case is reduced to $\sup_{u\in[0,1]} \|B^\circ(u)\|^2$
where $u \rightsquigarrow B^\circ(u) = B(u) - uB(1)$ is a vector of independent standard Brownian
bridges. In the general case, if $u \rightsquigarrow V(u,\theta_0)$ and $u \rightsquigarrow B(u)$ are independent, then
the limit in (16) is approximated by

$$\sup_{u\in[0,1]} \|B(u) - u^{1/2}\hat{V}_n(u)^{1/2}\hat{V}_n(1)^{-1/2}B(1)\|^2,$$

whose approximate distribution can be computed by some computer simulations for
the standard Brownian motions $u \rightsquigarrow B(u)$.
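The simulation mentioned above can be sketched as follows for the scalar case $d = 1$. The grid size, the number of replications, the function names and the particular path of $V$ are all assumptions made for illustration; in practice the values $\hat{V}_n(u_j)$ computed from data would be passed in. Brownian motion is approximated by a Gaussian random walk on the grid $u_j = j/m$, and the supremum of $|B(u) - u^{1/2}V(u)^{1/2}V(1)^{-1/2}B(1)|^2$ is taken over the grid points.

```python
import math
import random


def simulate_limit_sup(v_path, rng):
    """One draw of sup_u |B(u) - u^{1/2} V(u)^{1/2} V(1)^{-1/2} B(1)|^2 on u_j = j/m (scalar case)."""
    m = len(v_path)
    step_sd = math.sqrt(1.0 / m)
    b, path = 0.0, []
    for _ in range(m):
        b += rng.gauss(0.0, step_sd)  # Brownian increment over one grid step
        path.append(b)
    b1, v1 = path[-1], v_path[-1]
    sup_sq = 0.0
    for j, (bu, vu) in enumerate(zip(path, v_path), start=1):
        u = j / m
        diff = bu - math.sqrt(u * vu / v1) * b1
        sup_sq = max(sup_sq, diff * diff)
    return sup_sq


def critical_value(v_path, level=0.05, reps=1500, seed=0):
    """Empirical (1 - level)-quantile of the simulated limit variable."""
    rng = random.Random(seed)
    draws = sorted(simulate_limit_sup(v_path, rng) for _ in range(reps))
    return draws[min(reps - 1, int((1.0 - level) * reps))]


m = 400
v_linear = [2.0 * j / m for j in range(1, m + 1)]  # V(u) = u * V(1): distribution-free case
cv = critical_value(v_linear)
print(cv)
```

With $V(u) = uV(1)$ the simulated variable is the supremum of a squared Brownian bridge, so the resulting critical value can be compared with the square of the classical Kolmogorov quantile (about $1.36^2$ at the 5% level); the grid approximation makes the simulated value slightly smaller.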

Remark. In the typical cases of ergodic models, the matrix $V(u,\theta_0)$ is actually
$uI(\theta_0)$ where $I(\theta_0)$ is the Fisher information matrix. Hence $V(u,\theta_0) = uV(1,\theta_0)$
holds, and the result is reduced to the standard case.

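Proof. First let us prove (i). By Lemma 3.1 (i) we know that $\hat\theta_n$ is a consistent
estimator for $\theta_0$ under $H_0$. So it follows from Theorem 2.1 that

$$R_n\dot{\mathbb{M}}_n(u,\hat\theta_n) = R_n\dot{\mathbb{M}}_n(u,\theta_0) + \ddot{\mathbb{M}}_n(u,\tilde\theta_n(u)) R_n(\hat\theta_n - \theta_0)$$

$$= R_n\dot{\mathbb{M}}_n(u,\theta_0) - V_n(u,\theta_0)V_n(1,\theta_0)^{-1} R_n\dot{\mathbb{M}}_n(1,\theta_0) + \epsilon_n(u)$$

$$\to^d (u^{-1}V(u,\theta_0))^{1/2} \left( B(u) - u^{1/2}V(u,\theta_0)^{1/2}V(1,\theta_0)^{-1/2}B(1) \right), \quad \text{in } D[0,1],$$

where $\tilde\theta_n(u)$ is a random vector on the segment connecting $\theta_0$ and $\hat\theta_n$, and the
remainder terms $\epsilon_n(u)$ appearing above satisfy that $\sup_{u\in[0,1]} \|\epsilon_n(u)\| \to^p 0$. As a
result the claim (i) follows from the continuous mapping theorem.
The inequality in (ii) is proved as follows:

$$T_n \ge (R_n\dot{\mathbb{M}}_n(u,\hat\theta_n))^\top (u\hat{V}_n(u)^{-1}) R_n\dot{\mathbb{M}}_n(u,\hat\theta_n) \ge \lambda(uR_n^2\hat{V}_n(u)^{-1}) \|\dot{\mathbb{M}}_n(u,\hat\theta_n)\|^2 = \lambda(uR_n^2\hat{V}_n(u)^{-1}) \left\{ \|\dot{\mathcal{M}}(u,\theta_*)\|^2 + o_P(1) \right\}.$$

The proof is finished. □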

4 Example A: Ergodic diffusion process 



Recall the description of Example A in Section 1.3, where the first $d_A$ components
$\alpha$ of the parameter $\theta = (\alpha^\top, \beta^\top)^\top$ are involved in the drift coefficient, and the latter
$d_B$ components $\beta$ are in the diffusion coefficient. Recalling also the definition of the
rate matrix $R_n$ there, let us consider the $(d_A + d_B)$-dimensional random vectors
$\dot{\mathbb{M}}_n(u,\theta)$ and the $(d_A + d_B) \times (d_A + d_B)$ random matrices $\ddot{\mathbb{M}}_n(u,\theta)$ given as follows:

$$\dot{\mathbb{M}}_n(u,\theta) = (\dot{\mathbb{M}}_n^A(u,\theta)^\top, \dot{\mathbb{M}}_n^B(u,\theta)^\top)^\top, \qquad \ddot{\mathbb{M}}_n(u,\theta) = \begin{pmatrix} \ddot{\mathbb{M}}_n^A(u,\theta) & \ddot{\mathbb{M}}_n^C(u,\theta) \\ \ddot{\mathbb{M}}_n^C(u,\theta)^\top & \ddot{\mathbb{M}}_n^B(u,\theta) \end{pmatrix}.$$

Below, we will use the following notation: for a given constant $p > 1$ and a
given sequence of positive constants $r_n$, we write $\xi_n = o_{M(p)}(r_n^{-1})$ if

$$r_n^p E[\|\xi_n\|^p] \to 0. \tag{17}$$

Notice that $\xi_n = o_{M(p)}(r_n^{-1})$ implies that $\xi_n = o_P(r_n^{-1})$.

Under some regularity conditions which are usually assumed in the asymptotic
theory for ergodic diffusion process models, it is standard to show the following
facts (see e.g. the appendix of Kessler (1997) for some techniques needed to prove
them; see Nishiyama (2011), in Japanese, for the detailed proofs of the techniques
that are omitted in Kessler's (1997) appendix):



sup 

uG[0,l] 



(u, 9n 



1 



E 



(Wfn - W t n_ 



°M(p)((tn 



sup 

uG[0,l] 



E 



k:t"_ 1 <ut 




\j.n -in 

\ L k L k-l\ 



o M ( P )(n 1/2 ), 



sup sup 

ue[o,i] fee 



^ HA i X il-V e ^ e )\ t k-' t k-l\ 



OM(p)((t") 



n\-X/2\ 
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sup sup 

ug[o,i] fee 



E(u,i 



where 



- HB ( X t^ d o, 



OM(p)(n 



-1/2-n 



sup sup ||M£(u,0)|| = o A / (p )(n- 1/4 ), 
«e[o,i] eee 



S^x; a)(S , (x; «o) _ ^(^S «)) — S(x; a)S(x; a) 



a(x; (3) &(x; f3)&(x; (3) 



a(x-pf a(x-py 
a(x;f3)a(x;f3) T 



(a(x;/3 ) 2 -a(x;/3) 2 ) 



The regularity conditions for the above claims depend on the constant p > 1
appearing in "$o_{M(p)}(r_n^{-1})$" which we need to have.



4.1 Moment convergence 

The assumption §5§ can be checked by applying Burkholder-Davis-Gundy's in- 
equality to the main part of -R n M n (0 o ) = -R n M n (l, 8 ). On the other hand, noting 
also < n, we can apply Remark 1 (ii) of Uchida and Yoshida (2012) to show 
that the assumption ([6]) for M n (0) = M n (l,#) is satisfied for 



$$\mathbb{M}(\theta) = \begin{pmatrix} \mathbb{M}^A(\theta) \\ \mathbb{M}^B(\theta) \end{pmatrix}$$

with

$$\mathbb{M}^A(\theta) = \int H_A(x;\theta_0,\theta)\,\mu_{\theta_0}(dx) \quad \text{and} \quad \mathbb{M}^B(\theta) = \int H_B(x;\theta_0,\theta)\,\mu_{\theta_0}(dx),$$

where $\mu_{\theta_0}$ denotes the invariant distribution of $X$ when the true value is $\theta_0$. In order
to fulfil the assumption [M1] or [M2], we have to choose the parametric
model for the drift and diffusion coefficients suitably. An example for which the
assumption [M1] can be easily checked is $S(\cdot;\alpha) = \alpha^T a(\cdot)$ and $\sigma(\cdot;\beta) = e^{\beta^T b(\cdot)}$,
where $a(\cdot)$ and $b(\cdot)$ are some vectors of known functions, assuming that $b(\cdot)$ is
bounded. The assumption [M2] would be satisfied in more general parametric
models, because the $\mathbb{M}(\theta)$'s are non-random in this example.

4.2 Change point problem 

Under some standard conditions on the parametric family for the drift and dif- 
fusion coefficients in the context of ergodic diffusion processes, we can show that 






the condition © under H is satisfied with M 0O (1,0) = (M^(l, 6) T , Mg{\, 9) T ) T , 
where 

and 

and that the condition ffTT]) under ifj is satisfied with 

.M(«,0) = (iiA«,)M 9b (l,fl) + ((«-«*) V0)M fll (1,0). 

As stated there, the condition (fT3|) is automatically satisfied as soon as the natural 
conditions (TTOT) and (TT^T) are satisfied. 

Using the facts which we presented at the beginning of this section and the 
usual martingale central limit theorem, we can see that the condition ffl^|) and 
(TT5D hold for 

v„ A (u,e ) o 



where 



V ^ u ^=\ n o V n B (u,0 o ) 



n(u ' o) " f» ^ <r(x t » ;A,) 2 

K OA) = - > 



n ^ <j(X t n • 8 ) 

The limit of $V_n(u, \theta_0)$ is $V(u, \theta_0) = u I_{\theta_0}(\theta_0)$, where

hM = 

with 



ll{0) 



S(x;a)S(x;a) T B f &(x; (3)&(x; (3) 



\T 



We suppose that Je (6 l )'s are positive definite. 

As a consistent estimator $\hat{V}_n(u)$ for $V(u, \theta_0)$, we introduce



V n {u) 

where 



«4 A o 

ul? 
Since $V(u,\theta_0) = uV(1,\theta_0)$ in this example, the limit of our test statistic $T_n$ is
$\sup_{u\in[0,1]} \|B^\circ(u)\|^2$, where $u \mapsto B^\circ(u)$ is a vector of standard Brownian bridges, so
the test is asymptotically distribution free.

Finally, it is clear that under $H_1$, $\lambda(R_n^2 \hat{V}_n(u_*)^{-1})$ tends to $\infty$ in probability
since the matrix $u_* I_{\theta_0}(\theta_*) + (1 - u_*) I_{\theta_1}(\theta_*)$, which is the limit of $\hat{V}_n(u_*)$, is positive
definite. Thus the test is consistent.

4.3 Numerical study for change point problem 

In this section, as well as in Section 5.3 for Example B, we observe the finite sample
performance of our test statistic through numerical experiments. Here, we adopt
the Ornstein-Uhlenbeck process starting from $X_0 = x_0$ as the true (data-generating)
process:

$$X_t = x_0 - \alpha \int_0^t X_s\,ds + \beta W_t, \quad t \in [0,T]. \qquad (18)$$

For simplicity, we shall treat the equidistant sampling case, that is, $\Delta_n = |t_k^n - t_{k-1}^n|$
for every $k = 1, \ldots, n$.

We are going to observe the trajectory of the process (18) for different time
horizons $t_n^n = T$, and the number $n$ of observations for each trajectory is such that
$t_n^n = n^{1/3}$, so $\Delta_n = n^{-2/3}$. For this process (18) the estimators for the parameters
$\alpha$ and $\beta$ and the estimator of the information matrix can be explicitly calculated,
and thus the test statistic can be easily computed. For any fixed level $\varepsilon > 0$ the
critical value $c_\varepsilon$ is given by

$$P\left( \sup_{u\in[0,1]} \sum_{i=1}^{d_A+d_B} |B_i^\circ(u)|^2 > c_\varepsilon \right) = \varepsilon.$$

Table 1 of Lee et al. (2003) gives the critical values for the significance
levels $\varepsilon = 0.01, 0.05, 0.10$ and for different values of the dimension $d = d_A + d_B$,
computed by Monte Carlo simulation of the limit distribution. Throughout we
take the significance level to be $\varepsilon = 0.05$. For two parameters ($d = 2$) the critical
value is $c_\varepsilon = 2.408$. Regarding the null hypothesis, we generate $M = 10^4$ trajectories
of (18) and we evaluate the empirical size. The results are reported in Table 1. We
observe that the empirical size is close to the nominal level 0.05 for increasing
terminal times $T = t_n^n$, and also for small $T$. In the second example reported
in Table 1 the values of the parameters are the maximum likelihood estimates for
the monthly federal funds data 1963-1998 in Ait-Sahalia (1999).
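
The critical value defined above can be approximated numerically. The following is only a minimal sketch, not the procedure used by Lee et al. (2003): it assumes that the $d$ Brownian bridges are independent, discretises $[0,1]$ on a uniform grid (which slightly underestimates the supremum), and uses hypothetical function names and default replication counts chosen for illustration.

```python
import math
import random


def sup_bridge_norm_sq(d, n_grid, rng):
    """One draw of max_u sum_i B_i(u)^2 over a uniform grid, for d
    independent standard Brownian bridges (independence is assumed here)."""
    du = 1.0 / n_grid
    sd = math.sqrt(du)
    paths = []
    for _ in range(d):
        w, path = 0.0, []
        for _ in range(n_grid):
            w += rng.gauss(0.0, sd)      # Brownian motion increment
            path.append(w)
        paths.append(path)
    best = 0.0
    for k in range(n_grid):
        u = (k + 1) * du
        # Bridge value: B(u) = W(u) - u * W(1)
        s = sum((p[k] - u * p[-1]) ** 2 for p in paths)
        best = max(best, s)
    return best


def critical_value(d, eps, n_rep=2000, n_grid=200, seed=0):
    """Empirical (1 - eps)-quantile of the simulated supremum."""
    rng = random.Random(seed)
    draws = sorted(sup_bridge_norm_sq(d, n_grid, rng) for _ in range(n_rep))
    idx = min(n_rep - 1, math.ceil((1.0 - eps) * n_rep) - 1)
    return draws[idx]
```

With $d = 2$ and $\varepsilon = 0.05$ such a simulation should land in the vicinity of the tabulated value 2.408, up to Monte Carlo and discretisation error; finer grids and more replications reduce both.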

T                             5       10      15      20      25
n                             125     1000    3375    8000    15625
$\alpha = 1$, $\beta = 1$           0.044   0.054   0.050   0.052   0.053
$\alpha = 0.25$, $\beta = 0.02$     0.047   0.061   0.058   0.064   0.054

Table 1: Empirical size based on $M = 10^4$ independent statistics, for different time
horizons.

Regarding the alternative hypothesis we study the behavior of the test statistic
in three different situations and for different change points $u_* T$ of the parameters,
as follows:

• The drift coefficient changes from $\alpha_0$ to $\alpha_1$, but the diffusion coefficient does
not change.

• The drift coefficient does not change, but the diffusion coefficient changes
from $\beta_0$ to $\beta_1$.

• Both coefficients change.

For each of the above scenarios we consider the following change points:
$u_* = \frac{1}{2}, \frac{3}{4}, \frac{9}{10}$.

The first scenario is the worst case for the diffusion, and in order to detect a
change in the drift we have to observe the process as long as possible. Table 2
shows the empirical power for different terminal times $T$ and different change points
$u_* T$. The values of the parameters are $\alpha_0 = 0.25$, $\alpha_1 = 0.50$ and $\beta = 0.02$; the last
does not vary. We simulate $10^4$ independent copies of a trajectory of (18) to obtain
different values of $T_n$. The power increases as $T$ increases, and the performance is
better when we can observe the process for a long time after the change (the case
$u_* = \frac{1}{2}$). In such a case the power of the test is reasonable. In the worst case,
$u_* = \frac{9}{10}$, the test is not able to detect the change in the drift coefficient.
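
A trajectory of (18) with a change in the drift coefficient can be generated on the sampling grid as sketched below. This is an illustration under stated assumptions rather than the simulation scheme used for the tables, which the text does not specify: it uses the exact Gaussian transition of the Ornstein-Uhlenbeck process, places the change at the grid index $\lfloor u_* n \rfloor$, requires positive drift parameters, and takes a placeholder initial value `x0 = 0.0`. The function name is hypothetical.

```python
import math
import random


def simulate_ou_with_change(T, n, alpha0, alpha1, beta, u_star, x0=0.0, seed=0):
    """Simulate X at t_k = k*T/n, k = 0..n, for dX = -alpha*X dt + beta dW,
    where alpha switches from alpha0 to alpha1 after grid index floor(u_star*n).
    Assumes alpha0 > 0 and alpha1 > 0; x0 is a placeholder initial value."""
    rng = random.Random(seed)
    delta = T / n
    k_star = math.floor(u_star * n)
    path = [x0]
    for k in range(1, n + 1):
        alpha = alpha0 if k <= k_star else alpha1
        decay = math.exp(-alpha * delta)
        # Exact conditional standard deviation over one step of length delta
        sd = beta * math.sqrt((1.0 - decay ** 2) / (2.0 * alpha))
        path.append(path[-1] * decay + sd * rng.gauss(0.0, 1.0))
    return path


# Design of the study: T = n^(1/3), e.g. T = 5 with n = 125 observations.
path = simulate_ou_with_change(T=5, n=125, alpha0=0.25, alpha1=0.50,
                               beta=0.02, u_star=0.5)
```

Setting `alpha1 = alpha0` gives a trajectory under the null hypothesis, so the same function can serve for both size and power experiments.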



T                  5       10      15      20      25
n                  125     1000    3375    8000    15625
$u_* = 1/2$        0.31    0.52    0.73    0.79    0.88
$u_* = 3/4$        0.12    0.17    0.23    0.26    0.35
$u_* = 9/10$       0.05    0.07    0.08    0.08    0.09

Table 2: Empirical power based on $M = 10^4$ independent statistics. Here the
significance level is 0.05. The values of the parameters are $\alpha_0 = 0.25$, $\alpha_1 = 0.50$
and $\beta = 0.02$ (it does not vary).

Table 3 reports the simulation results when only the drift changes, but the
change is bigger. With $\alpha_0 = 0.25$, $\alpha_1 = 1.25$ and $\beta = 0.02$, the power increases
not only for $u_* = \frac{1}{2}$ but also for $u_* = \frac{3}{4}$. This was expected, but as in the previous
example the performance of the test is not good when the change of the parameter
occurs at the end of the observation window.



T                  5       10      15      20      25
n                  125     1000    3375    8000    15625
$u_* = 1/2$        0.35    0.60    0.78    0.88    0.94
$u_* = 3/4$        0.13    0.20    0.28    0.31    0.38
$u_* = 9/10$       0.06    0.08    0.09    0.11    0.11

Table 3: Empirical power based on $M = 10^4$ independent statistics. Here the
significance level is 0.05. The values of the parameters are $\alpha_0 = 0.25$, $\alpha_1 = 1.25$
and $\beta = 0.02$ (it does not vary).



Table 4 shows the empirical power for different terminal times $T$ and different
change points in the second scenario: $\alpha = 0.25$ does not vary, but $\beta_0 = 0.02$
changes and becomes $\beta_1 = 0.03$. The power of the test is very good, but this is
not surprising because a change in the diffusion coefficient can be easily detected.
The situation reported in Table 5 is the same: even for a very small change in the
diffusion coefficient, the performance of the test is very good, with the empirical
power reaching the value 1 even for small $T$. We do not report the results for
the third scenario, where the drift changes at the same instant as the diffusion
coefficient, because the performance of the test is the same as in the second scenario.
This is not surprising and is due to the different rates of convergence of the
estimators of the two parameters.
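
To see how far apart the two rates are in this design, the short sketch below tabulates them for the terminal times used in the tables. It assumes the usual normalisations of high-frequency asymptotics for ergodic diffusions, namely $\sqrt{n\Delta_n} = \sqrt{T}$ for the drift parameter and $\sqrt{n}$ for the diffusion parameter; these are taken here as an assumption about what the rate matrix $R_n$ of Section 1.3 encodes, and the function name is hypothetical.

```python
import math


def convergence_rates(T):
    """Return n and the two assumed normalising rates for terminal time T."""
    n = round(T ** 3)                  # study design: t_n^n = n^(1/3) = T
    delta = T / n                      # equidistant step, equal to n^(-2/3)
    return {
        "n": n,
        "drift_rate": math.sqrt(n * delta),   # sqrt(n * Delta_n) = sqrt(T)
        "diffusion_rate": math.sqrt(n),
    }


for T in (5, 10, 15, 20, 25):
    r = convergence_rates(T)
    print(f"T={T:>2}  n={r['n']:>5}  "
          f"drift={r['drift_rate']:.2f}  diffusion={r['diffusion_rate']:.2f}")
```

Under these assumed rates the diffusion parameter is estimated far more precisely than the drift parameter at every sample size considered, which is in line with the much higher power observed when the diffusion coefficient changes.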



5 Example B: Volatility of diffusion process 



Recall the description of Example B in Section 1.3. An interesting point of this
example is that the limit of $-\dot{\mathbb{M}}_n(u, \theta_n(u))$ is random and depends on $u \in [0, 1]$ in
a complex way.

Let a constant p > 1 be given, and recall the notation (17). Under some
regularity conditions, it holds that



sup 

uG[0,l] 



KM)-- V] 

n z — ' 



aiXq ^Oo) I \Wq - W t n_ 




-in 



OM{p) 



n 



-1/2-1 



T                  5       10      15      20      25
n                  125     1000    3375    8000    15625
$u_* = 1/2$        0.99    1       1       1       1
$u_* = 3/4$        0.86    1       1       1       1
$u_* = 9/10$       0.36    0.99    1       1       1

Table 4: Empirical power based on $M = 10^4$ independent statistics. Here the
significance level is 0.05. The values of the parameters are $\alpha = 0.25$ (it does not
vary), $\beta_0 = 0.02$ and $\beta_1 = 0.03$.



where 



sup sup 

ue[o,i] 0ee 



H(x;8n 



[u, 



cr(x: 



OM(p)[n 



-l/2^ 



cr(x 



alx 



(a(x;9 ) 2 



-2- 



a(x\ 



The regularity conditions for the above claims depend on the constant p > 1 which
we need to have. Moreover, under some standard conditions, it holds that for any
sequence of random vectors $\theta_n(u)$ indexed by $u \in [0,1]$ such that
$\sup_{u\in[0,1]} \|\theta_n(u) - \theta_0\| \to^p 0$,

$$\sup_{u\in[0,1]} \|\dot{\mathbb{M}}_n(u, \theta_n(u)) + V_n(u, \theta_0)\| \to^p 0,$$

where 

V n {uM = ~ £ Y W » V«6 [0,1]. 

Also, it follows from the well known theory of martingales that 

(V^M n (Mo),KMo)) -> d ((iTV(Mo)) 1/2 £(«),^(Mo)) in D[0,1], 

where $u \mapsto B(u)$ is a vector of independent standard Brownian motions which is
independent of the matrix valued random process $u \mapsto V(u, \theta_0)$ given by



V(Mo) 



uT 



&(X 8 ;6 )&(X a ;6 
a(X s ;6 ) 2 



-ds, VnG [0,1]. 
T                  5       10      15      20      25
n                  125     1000    3375    8000    15625
$u_* = 1/2$        0.87    1       1       1       1
$u_* = 3/4$        0.52    0.99    1       1       1
$u_* = 9/10$       0.14    0.62    0.99    1       1

Table 5: Empirical power based on $M = 10^4$ independent statistics. Here the
significance level is 0.05. The values of the parameters are $\alpha = 0.25$ (it does not
vary), $\beta_0 = 0.020$ and $\beta_1 = 0.025$.



5.1 Moment convergence 



Due to the above facts (for $u = 1$), Theorem 2.1 yields that for any consistent
estimator $\hat{\theta}_n$ for $\theta_0$ satisfying $\mathbb{M}_n(\hat{\theta}_n) = o_P(n^{-1/2})$ we have
$\sqrt{n}(\hat{\theta}_n - \theta_0) \to^d V(\theta_0)^{-1/2} Z$, where $Z$ is a standard Gaussian random vector
which is independent of $V(\theta_0) = V(1, \theta_0)$.

Next let us apply Theorem 2.2. The assumption (5) for $\sqrt{n}\,\mathbb{M}_n(\theta_0) = \sqrt{n}\,\mathbb{M}_n(1, \theta_0)$
can be checked by using the Burkholder-Davis-Gundy inequality. In the case of this
example, checking that the assumption (6) for $\mathbb{M}_n(\theta) = \mathbb{M}_n(1, \theta)$ is satisfied with

$$\mathbb{M}(\theta) = \int_0^T H(X_t; \theta_0, \theta)\,dt$$

is easy. In order to fulfil the assumption [M1] or [M2], we again have to
choose the parametric model for the diffusion coefficient suitably. An example
for which the former assumption in [M1] can be easily checked is $\sigma(\cdot; \theta) = e^{\theta^T g(\cdot)}$,
where $g(\cdot)$ is a vector of known, bounded functions. The latter assumption
in [M1] is then reduced to



E 



g(X t )g(X t ) T dt 



pb/y 



< OO, 



for which we can give a clear sufficient condition for the function $g$ at least in the
one-dimensional case (for example, just assume $|g(\cdot)|^2 > c$ for a constant $c > 0$).
5.2 Change point problem 

Under some standard conditions on the parametric family for the diffusion coeffi- 
cient, we can show that the condition Q under H is satisfied with 

M 6o (l,e) = r a ~^% (a(X t ; 9 ) 2 - a(X t ; 9) 2 )dt. 

Jo <r{x t ) oy 

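Under $H_1$, we have

$$\sup_{u\in[0,1]} \|\mathbb{M}_n(u, \theta) - \mathcal{M}(u, \theta)\| \to^p 0,$$

where

$$\mathcal{M}(u, \theta) = \int_0^{(u \wedge u_*)T} \frac{\dot{\sigma}(X_t; \theta)}{\sigma(X_t; \theta)^3} \left( \sigma(X_t; \theta_0)^2 - \sigma(X_t; \theta)^2 \right) dt
+ 1\{u > u_*\} \int_{u_* T}^{uT} \frac{\dot{\sigma}(X_t; \theta)}{\sigma(X_t; \theta)^3} \left( \sigma(X_t; \theta_1)^2 - \sigma(X_t; \theta)^2 \right) dt.$$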

We can give a set of sufficient conditions for (13) as follows. Suppose that the
Lebesgue measure of the random set $\mathcal{T} = \{t \in [0,T] : \dot{\sigma}(X_t; \theta_*)/\sigma(X_t; \theta_*)^3 = 0\}$
is zero almost surely, which is true in many concrete models. In this case, replace
the values $\dot{\sigma}(X_t; \theta_*)/\sigma(X_t; \theta_*)^3$ in the definition of $\mathcal{M}(u, \theta_*)$ on the set $\mathcal{T}$ by 1
to construct $\mathcal{M}(u, \theta_*)^*$, which equals the original $\mathcal{M}(u, \theta_*)$ for all $u \in [0,1]$,
almost surely. If we further assume that for any non-empty interval $J \subset I$

$$\sigma(x; \theta) = \sigma(x; \theta'), \ \forall x \in J \iff \theta = \theta'$$

and that each of the random sets $J_0 = \{X_t : t \in [0, u_* T]\}$ and $J_1 = \{X_t :
t \in (u_* T, T]\}$ includes a non-empty interval almost surely, then it follows from the
assumption $\theta_0 \neq \theta_1$ that $\frac{\partial}{\partial u}\mathcal{M}(u, \theta_*)^* \neq 0$ almost surely. Thus we have

$$\sup_{u\in(0,1)} \|\mathcal{M}(u, \theta_*)\| = \sup_{u\in(0,1)} \|\mathcal{M}(u, \theta_*)^*\| > 0, \quad \text{almost surely.}$$

Now, consider the matrices 

Jo o-{A t ,9y 

Let us assume that for every $\theta \in \Theta$ there exists a set $J_\theta \subset I$ such that the
Lebesgue measure of $J_\theta^c$ is zero and that the matrices $\dot{\sigma}(x; \theta)\dot{\sigma}(x; \theta)^T/\sigma(x; \theta)^2$ are
positive definite for $x \in J_\theta$, which is a standard assumption. In this case, if the
claim that the Lebesgue measure of the set $\{t \in [0,T] : X_t(\omega) \in J\}$ is positive for
any set $J \subset I$ such that the Lebesgue measure of $J^c$ is zero holds for almost all
$\omega$, then the $V(u, \theta_0)$'s and $V(u, \theta_*)$'s for $u \in (0,1]$ are positive definite almost surely
under $H_0$ and $H_1$, respectively.
As we saw at the beginning of this section, the conditions (14) and (15) under
$H_0$ are satisfied. A consistent estimator $\hat{V}_n(u)$ for $V(u, \theta_0)$ is given by

Vn(u) = ^ W J V" . V«G [0,1]. 



Our test in this example is not asymptotically distribution free.

Finally, it is clear that under $H_1$, $n\lambda(\hat{V}_n(u)^{-1})$ tends to $\infty$ in probability, because
it follows from what we have assumed that $\lambda(\hat{V}_n(u)^{-1}) \to^p \lambda(V(u, \theta_*)^{-1})$ and the
limit is positive almost surely. Thus the test is consistent.



5.3 Numerical study for change point problem 

The data-generating process is the following: 

X t = 4 - J (X s - A)ds + J exp {^Y^X^) dWs ' 1 G [ °' 1] ' 

where the drift coefficient $S(x) = -(x - 4)$ is treated as a nuisance function. Suppose
that we observe $M = 10^3$ independent copies of this process at the equidistant
time grid $t_k^n = k/n$, $k = 0, 1, \ldots, n$. We compute the critical value of the test based
on the approximation of the limit distribution



sup \B(u) -n 1 /V(u,^o) 1/2 ^(l,^o) _1/2 5(l)| 2 
ite [o,i] 



(19) 



obtained by replacing 



by the natural estimator 



v(u,e 



ru 


x 2 s 


Jo 


i + xi 



ds 



V n {u) 



r, [««] 

-E 



Xj-n 
L k-1 



1 + A, 2 „ 



and running $10^3$ Monte Carlo simulations of the standard Brownian motion
$u \mapsto B(u)$.

The empirical size under $H_0$ is reported in Table 6, where the true value of the
parameter is set as $\theta_0 = 1.0$ or $1.5$. We see that the convergence to the approximate
distribution of (19) is not perfectly good, but it is reasonable even for the cases
where $n$ is small.

The empirical power under $H_1$ is reported in Table 7, where the true values of
the parameter change from $\theta_0 = 1.0$ to $\theta_1 = 1.5$ at time point $u_* = \frac{1}{2}$, $\frac{3}{4}$ or $\frac{9}{10}$.
n                    20      40      100     200
$\theta_0 = 1.0$     0.026   0.024   0.042   0.040
$\theta_0 = 1.5$     0.026   0.023   0.037   0.034

Table 6: Empirical size based on $M = 10^3$ independent statistics. Here the
significance level is 0.05. The value of the parameter is $\theta_0 = 1.0$ or $1.5$.
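
When reading empirical sizes such as those in Table 6, it helps to keep the Monte Carlo error of the rejection frequency in mind. The sketch below is a generic back-of-the-envelope check, not part of the authors' procedure: it treats each of the $M$ replications as an independent Bernoulli trial with success probability equal to the nominal level, and the function names are hypothetical.

```python
import math


def size_standard_error(level, n_rep):
    """Binomial standard error of an empirical rejection frequency
    when the true rejection probability equals `level`."""
    return math.sqrt(level * (1.0 - level) / n_rep)


def two_sigma_band(level, n_rep):
    """Interval level +/- 2 standard errors around the nominal level."""
    se = size_standard_error(level, n_rep)
    return level - 2.0 * se, level + 2.0 * se


# M = 10^3 replications at nominal level 0.05, as in Table 6.
low, high = two_sigma_band(0.05, 1000)
```

For $M = 10^3$ the standard error is about 0.007, so the band is roughly $[0.036, 0.064]$; entries of Table 6 below about 0.036 therefore differ from the nominal level by more than simulation noise alone would suggest. For $M = 10^4$ the standard error drops to about 0.002.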



n                  20      40      100     200
$u_* = 1/2$        0.067   0.331   0.755   0.946
$u_* = 3/4$        0.125   0.255   0.630   0.873
$u_* = 9/10$       0.048   0.117   0.275   0.462

Table 7: Empirical power based on $M = 10^3$ independent statistics. Here the
significance level is 0.05. The values of the parameter change from $\theta_0 = 1.0$ to
$\theta_1 = 1.5$ at time $u_* = \frac{1}{2}$, $\frac{3}{4}$ or $\frac{9}{10}$.



6 Example C: Cox's regression model 



Recall the description of Example C in Section 1.3. Since all the arguments are 
similar to those in Section [U we state only the key points in the discussion on the 
change point problem. 

Introducing the notations 



we suppose that 



St ,2 (0) 



k=l 
n 

j2zy^Y t \ 

k=l 
n 



k=l 



sup sup 

eee te[o,T] 



±Sf{0)-S\{0) 



^ p 0, Z = 0, 1,2, 
where the limits $t \mapsto S_t^l$ are some stochastic processes (cf. Andersen and Gill
(1982), who assumed that the $S^l$'s are not random).

Then, some arguments similar to those in Section 5.2 are possible for

T/r m ( iuAU * )T g(g)g(g) ~ si{B)si{ey ^ 

V{u,6) = J S t (9 )a(t)dt 

+i{u>u4 j so(eY S t (0i)a(f)df, 

T? m i r T ^°(^)^' 2 (^) - g^fog^fer , vfc 
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